A Support for Non-Uniform Parallel Loops and its

Author

  • Salvatore Orlando
Abstract

This paper presents SUPPLE (SUPport for Parallel Loop Execution), an innovative run-time support for parallel loops with regular stencil data references and non-uniform iteration costs. SUPPLE relies upon a static block data distribution to exploit locality, and combines static and dynamic policies for scheduling non-uniform iterations. It adopts, as far as possible, a static scheduling policy derived from the owner computes rule, and moves data and iterations among processors only if a load imbalance actually occurs. SUPPLE always tries to overlap communications with useful computations by reordering loop iterations and prefetching remote ones in the case of workload imbalance. The SUPPLE approach has been validated by many experimental results obtained by running a multi-dimensional flame simulation kernel on a 64-node Cray T3D. We have fed the benchmark code with several synthetic input data sets built on the basis of a load imbalance model, and we have compared our results with those obtained with a CRAFT Fortran implementation of the benchmark.
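The hybrid policy described in the abstract can be illustrated with a small simulation. This is only a toy sketch under stated assumptions, not SUPPLE's actual algorithm: the function names (`block_partition`, `static_makespan`, `hybrid_makespan`), the `chunk` parameter, and the rule "an idle processor takes iterations from the tail of the most loaded queue" are all hypothetical, and communication cost, prefetching, and iteration reordering are not modeled. It shows only the core idea that each processor first runs the iterations of its own block (owner computes) and that iterations migrate only once a processor has run out of local work.

```python
import heapq
from collections import deque


def block_partition(n_iters, n_procs):
    """Split range(n_iters) into contiguous blocks, one per processor."""
    base, extra = divmod(n_iters, n_procs)
    blocks, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        blocks.append(deque(range(start, start + size)))
        start += size
    return blocks


def static_makespan(costs, n_procs):
    """Finish time when every processor runs only its own block."""
    blocks = block_partition(len(costs), n_procs)
    return max(sum(costs[i] for i in block) for block in blocks)


def hybrid_makespan(costs, n_procs, chunk=1):
    """Owner computes first; an idle processor then takes up to `chunk`
    iterations from the tail of the queue with the most remaining work.
    Returns (makespan, number of migrated iterations).
    Assumption: migration itself is free (no communication cost)."""
    queues = block_partition(len(costs), n_procs)
    remaining = [sum(costs[i] for i in q) for q in queues]
    ready = [(0, p) for p in range(n_procs)]  # (time processor is free, id)
    heapq.heapify(ready)
    makespan, migrated = 0, 0
    while ready:
        now, p = heapq.heappop(ready)
        makespan = max(makespan, now)
        if queues[p]:
            # Local work available: take the next owned iteration.
            victim, batch = p, [queues[p].popleft()]
        else:
            # No local work: look for a processor that still has some.
            candidates = [q for q in range(n_procs) if queues[q]]
            if not candidates:
                continue  # nothing left anywhere; this processor stops
            victim = max(candidates, key=lambda q: remaining[q])
            take = min(chunk, len(queues[victim]))
            batch = [queues[victim].pop() for _ in range(take)]
            migrated += len(batch)
        work = sum(costs[i] for i in batch)
        remaining[victim] -= work
        heapq.heappush(ready, (now + work, p))
    return makespan, migrated


# 16 iterations on 4 processors; the last block is ten times more expensive.
costs = [1] * 12 + [10] * 4
print(static_makespan(costs, 4))      # 40
print(hybrid_makespan(costs, 4))      # (14, 3)
print(hybrid_makespan([1] * 16, 4))   # (4, 0): balanced load, no migration
```

In the balanced case no iteration leaves its owner, so the schedule is identical to the static one; work moves only when the imbalance actually shows up at run time.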


Similar resources

SUPPLE: An efficient run-time support for non-uniform parallel loops

This paper presents SUPPLE (SUPport for Parallel Loop Execution), an innovative run-time support for the execution of parallel loops with regular stencil data references and non-uniform iteration costs. SUPPLE relies upon a static block data distribution to exploit locality, and combines static and dynamic policies for scheduling non-uniform iterations. It adopts, as far as possible, a static sc...


Loop Pipelining Algorithm for Non-uniform Increasing Dependency Loops

Using parallel processing systems to execute scientific applications is one of the most common solutions for achieving more efficient computing performance. However, those applications may contain loops with non-uniform dependencies which may cause a compiler to produce sequential codes. This paper presents a new scheduling methodology for non-uniform dependency loops, considering a limited number...


An Optimized Three Region Partitioning Technique to Maximize Parallelism of Nested Loops With Non-uniform Dependences

Many methods for nested loop partitioning exist; however, most of them perform poorly when they partition loops with non-uniform dependences. This paper proposes a generalized and optimized loop partitioning mechanism which can exploit parallelism in nested loops with non-uniform dependences. Our approach, based on the region partitioning technique, divides the loop into variable size p...


LOCALITY AND LOOP SCHEDULING ON NUMA MULTIPROCESSORS

An important issue in the parallel execution of loops is how to partition and schedule the loops onto the available processors. While most existing dynamic scheduling algorithms manage load imbalances well, they fail to take locality into account and therefore perform poorly on parallel systems with non-uniform memory access times. In this paper, we propose a new loop scheduling algorithm, Loca...


A Practical Scheduling Scheme for Non-Uniform Parallel Loops on Distributed Memory Parallel Machines

Loops without dependences among iterations are a rich source of parallelism in many applications. Among these types of loops, non-uniform loops with variable execution times need efficient scheduling schemes to take advantage of the capabilities of parallel machines. In this paper, we present a global distributed control scheme (GDC) to schedule nonuniform loops on distributed memory parallel ma...




Publication date: 1997